Adaptive Dropout for Pruning Conformers
Kubo, Yotaro, Cai, Xingyu, Bacchiani, Michiel
This paper proposes a method for joint training and pruning based on adaptive dropout layers with unit-wise retention probabilities. The retention probability of each unit in a dropout layer is estimated via back-propagation using the Gumbel-Softmax technique, and a unit with a small estimated retention probability can be considered prunable. This pruning method is applied at several points in a Conformer so that the effective number of parameters can be significantly reduced. Specifically, adaptive dropout layers are introduced in three locations in each Conformer block: (a) the hidden layer of the feed-forward-net component, (b) the query and value vectors of the self-attention component, and (c) the input vectors of the LConv component. The proposed method is evaluated in a speech recognition experiment on the LibriSpeech task, where it is shown to achieve a parameter reduction and an accuracy improvement simultaneously: word error rates improved by approximately 1% while the number of parameters was reduced by 54%.
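The sketch below illustrates the general idea of a dropout layer with learnable unit-wise retention probabilities relaxed via the binary Gumbel-Softmax (concrete) trick, so that they can be trained by back-propagation. It is an assumed, minimal illustration rather than the authors' implementation; the module name, temperature, and pruning threshold are placeholders chosen for the example.

```python
import torch
import torch.nn as nn


class AdaptiveDropout(nn.Module):
    """Dropout with one learnable retention probability per unit (illustrative sketch)."""

    def __init__(self, num_units: int, temperature: float = 0.3):
        super().__init__()
        # One retention logit per unit; sigmoid(logit) is the retention probability.
        self.retention_logits = nn.Parameter(torch.zeros(num_units))
        self.temperature = temperature

    def retention_probs(self) -> torch.Tensor:
        return torch.sigmoid(self.retention_logits)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Assumes the unit dimension is the last dimension of x.
        if self.training:
            # Relaxed Bernoulli gate: add logistic noise to the logits and squash
            # with a low-temperature sigmoid (binary Gumbel-Softmax / concrete).
            u = torch.rand_like(x).clamp(1e-6, 1.0 - 1e-6)
            noise = torch.log(u) - torch.log1p(-u)
            gate = torch.sigmoid((self.retention_logits + noise) / self.temperature)
        else:
            # Deterministic gate at inference: scale by the expected retention.
            gate = self.retention_probs()
        return x * gate

    def prunable_units(self, threshold: float = 0.05) -> torch.Tensor:
        # Units whose estimated retention probability is very small can be removed.
        return self.retention_probs() < threshold
```

In this sketch, such a layer would be placed on, for example, the hidden activations of a Conformer feed-forward module; after training, units flagged by `prunable_units` could be dropped to shrink the layer.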
DARE the Extreme: Revisiting Delta-Parameter Pruning For Fine-Tuned Models
Deng, Wenlong, Zhao, Yize, Vakilian, Vala, Chen, Minghui, Li, Xiaoxiao, Thrampoulidis, Christos
Storing open-source fine-tuned models separately introduces redundancy and increases response times in applications that use multiple models. Delta-parameter pruning (DPP), in particular the random drop and rescale (DARE) method proposed by Yu et al., addresses this by pruning the majority of delta parameters (the differences between fine-tuned and pre-trained model weights) while typically maintaining minimal performance loss. However, DARE fails when either the pruning rate or the magnitude of the delta parameters is large. We highlight two key reasons for this failure: (1) an excessively large rescaling factor as pruning rates increase, and (2) high mean and variance in the delta parameters. To push DARE's limits, we introduce DAREx (DARE the eXtreme), which features two algorithmic improvements: (1) DAREx-q, a rescaling-factor modification that significantly boosts performance at high pruning rates (e.g., >30% on CoLA and SST-2 for encoder models, with even greater gains in decoder models), and (2) DAREx-L2, which combines DARE with AdamR, an in-training method that applies appropriate delta regularization before DPP. We also demonstrate that DAREx-q can be seamlessly combined with vanilla parameter-efficient fine-tuning techniques such as LoRA and can facilitate structural DPP. Additionally, we revisit the application of importance-based pruning techniques within DPP, showing that they outperform random-based methods when the delta parameters are large. Through this comprehensive study, we develop a pipeline for selecting the most appropriate DPP method under various practical scenarios.
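For context, the sketch below shows the basic random drop-and-rescale operation on delta parameters. It is an assumed illustration, not the authors' code: vanilla DARE rescales surviving deltas by 1/(1-p), and since DAREx-q replaces that factor with a tuned one (the exact choice is given in the paper and not reproduced here), the rescaling factor is simply exposed as an argument.

```python
import torch


def dare_prune(pretrained: torch.Tensor,
               finetuned: torch.Tensor,
               drop_rate: float,
               rescale: float | None = None) -> torch.Tensor:
    """Return pruned weights: pre-trained weights plus sparsified, rescaled deltas."""
    delta = finetuned - pretrained
    # Keep each delta entry independently with probability (1 - drop_rate).
    keep_mask = (torch.rand_like(delta) >= drop_rate).to(delta.dtype)
    if rescale is None:
        rescale = 1.0 / (1.0 - drop_rate)  # vanilla DARE rescaling factor
    return pretrained + delta * keep_mask * rescale


# Example: drop 99% of the deltas of a single (hypothetical) weight matrix.
if __name__ == "__main__":
    torch.manual_seed(0)
    w_pre = torch.randn(4, 4)
    w_ft = w_pre + 0.01 * torch.randn(4, 4)
    w_pruned = dare_prune(w_pre, w_ft, drop_rate=0.99)
    print((w_pruned - w_pre).count_nonzero())  # roughly 1% of the 16 deltas survive
```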